99 research outputs found

    Learning from Outside the Viability Kernel: Why we Should Build Robots that can Fall with Grace

    Despite impressive results using reinforcement learning to solve complex problems from scratch, in robotics this has still been largely limited to model-based learning with very informative reward functions. One of the major challenges is that the reward landscape often has large patches with no gradient, making it difficult to sample gradients effectively. We show here that the robot state initialization can have a more important effect on the reward landscape than is generally expected. In particular, we show the counter-intuitive benefit of including initializations that are unviable, in other words, initializing in states that are doomed to fail. Comment: Proceedings of the 2018 IEEE International Conference on Simulation, Modeling and Programming for Autonomous Robots (SIMPAR), Brisbane, Australia, 16-19 201

    Beyond Basins of Attraction: Quantifying Robustness of Natural Dynamics

    Properly designing a system to exhibit favorable natural dynamics can greatly simplify designing or learning the control policy. However, it is still unclear what constitutes favorable natural dynamics and how to quantify its effect. Most studies of simple walking and running models have focused on the basins of attraction of passive limit cycles and the notion of self-stability. We instead emphasize the importance of stepping beyond basins of attraction. We show an approach based on viability theory to quantify robust sets in state-action space. These sets are valid for the family of all robust control policies, which allows us to quantify the robustness inherent to the natural dynamics before designing the control policy or specifying a control objective. We illustrate our formulation using spring-mass models, which are simple low-dimensional models of running systems. We then show an example application by optimizing the robustness of a simulated planar monoped, using a gradient-free optimization scheme. Both case studies result in a nonlinear effective stiffness providing more robustness. Comment: 15 pages. This work has been accepted to IEEE Transactions on Robotics (2019)
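As a rough illustration of what a viable set in state-action space looks like, the sketch below computes one for a toy discrete system. Everything in it is an assumption made for illustration and is not taken from the paper: the dynamics `x' = 2x + a`, the state and action grids, the function names, and the per-state "fraction of viable actions", which only stands in for the kind of robustness measure the abstract describes.

```python
from itertools import product

STATES = range(-5, 6)   # hypothetical discretized states; anything outside is a failure
ACTIONS = range(-2, 3)  # hypothetical discretized actions


def step(x, a):
    """Toy unstable dynamics x' = 2x + a (an assumption, not the paper's model)."""
    return 2 * x + a


def viability_kernel(states, actions, step):
    """States from which at least one action sequence avoids failure forever."""
    kernel = set(states)
    while True:
        # Keep only states that have some action leading back into the candidate set.
        survivors = {x for x in kernel
                     if any(step(x, a) in kernel for a in actions)}
        if survivors == kernel:
            return kernel
        kernel = survivors


def viable_pairs(states, actions, step):
    """State-action pairs whose successor lies inside the viability kernel."""
    kernel = viability_kernel(states, actions, step)
    return {(x, a) for x, a in product(kernel, actions) if step(x, a) in kernel}


kernel = viability_kernel(STATES, ACTIONS, step)
pairs = viable_pairs(STATES, ACTIONS, step)

# Illustrative per-state score: the fraction of actions that keep the system viable.
robustness = {x: sum((x, a) in pairs for a in ACTIONS) / len(ACTIONS)
              for x in sorted(kernel)}
print(sorted(kernel))
print(robustness)
```

The viable pairs are computed without fixing a control policy or an objective, which is the property the abstract emphasizes: any policy that avoids failure has to choose its actions from this set.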

    Viability in State-Action Space: Connecting Morphology, Control, and Learning

    How can we enable robots to learn control model-free and directly on hardware? Machine learning is taking its place as a standard tool in the roboticist’s arsenal. However, there are several open questions on how to learn control for physical systems. This thesis provides two answers to this motivating question. The first is a formal means to quantify the inherent robustness of a given system design, prior to designing the controller or learning agent. This emphasizes the need to consider both the hardware and software design of a robot, which are inseparably intertwined in the system dynamics. The second is the formalization of a safety measure, which can be learned model-free. Intuitively, this measure indicates how easily a robot can avoid failure, and enables robots to explore unknown environments while avoiding failures. The main contributions of this dissertation are based on viability theory. Viability theory provides a slightly unconventional view of dynamical systems: instead of focusing on a system’s convergence properties towards equilibria, the focus is shifted towards sets of failure states and the system’s ability to avoid these sets. This view is particularly well suited to studying learning control in robots, since stability in the sense of convergence can rarely be guaranteed during the learning process. The notion of viability is formally extended to state-action space, with viable sets of state-action pairs. A measure defined over these sets allows a quantified evaluation of robustness valid for the family of all failure-avoiding control policies, and also paves the way for enabling safe model-free learning. The thesis also includes two minor contributions. The first minor contribution is an empirical demonstration of shaping by exclusively modifying the system dynamics. This demonstration highlights the importance of robustness to failures for learning control: not only can failures cause damage, but they typically do not provide useful gradient information for the learning process. The second minor contribution is a study on the choice of state initializations. Counter to intuition and common practice, this study shows it can be more reliable to occasionally initialize the system from a state that is known to be uncontrollable.

    Shaping in Practice: Training Wheels to Learn Fast Hopping Directly in Hardware

    Learning instead of designing robot controllers can greatly reduce the engineering effort required, while also emphasizing robustness. Despite considerable progress in simulation, applying learning directly in hardware is still challenging, in part due to the necessity to explore potentially unstable parameters. We explore the concept of shaping the reward landscape with training wheels: temporary modifications of the physical hardware that facilitate learning. We demonstrate the concept with a robot leg mounted on a boom learning to hop fast. This proof of concept embodies typical challenges such as instability and contact, while being simple enough to empirically map out and visualize the reward landscape. Based on our results, we propose three criteria for designing effective training wheels for learning in robotics. A video synopsis can be found at https://youtu.be/6iH5E3LrYh8. Comment: Accepted to the IEEE International Conference on Robotics and Automation (ICRA) 2018, 6 pages, 6 figures

    Learning Fast and Precise Pixel-to-Torque Control

    In the field, robots often need to operate in unknown and unstructured environments, where accurate sensing and state estimation (SE) become a major challenge. Cameras have been used to great success in mapping and planning in such environments, as well as in complex but quasi-static tasks such as grasping, but are rarely integrated into the control loop for unstable systems. Learning pixel-to-torque control promises to allow robots to flexibly handle a wider variety of tasks. Although it does not present additional theoretical obstacles, learning pixel-to-torque control for unstable systems that require precise and high-bandwidth control still poses a significant practical challenge, and best practices have not yet been established. To help drive reproducible research on the practical aspects of learning pixel-to-torque control, we propose a platform that can flexibly represent the entire process, from lab to deployment, for learning pixel-to-torque control on a robot with fast, unstable dynamics: the vision-based Furuta pendulum. The platform can be reproduced with either off-the-shelf or custom-built hardware. We expect that this platform will allow researchers to quickly and systematically test different approaches, as well as reproduce and benchmark case studies from other labs. We also present a first case study on this system using DNNs which, to the best of our knowledge, is the first demonstration of learning pixel-to-torque control on an unstable system with update rates faster than 100 Hz. A video synopsis can be found online at https://youtu.be/S2llScfG-8E, and in the supplementary material. Comment: video: https://www.youtube.com/watch?v=S2llScfG-8E 9 pages. Published in Robotics and Automation Magazine

    On Designing an Active Tail for Legged Robots: Simplifying Control via Decoupling of Control Objectives

    This work explores the possible roles of active tails for steady-state legged locomotion. A series of simple models is proposed which captures the dynamics of an idealized running system with an active tail. The models suggest that the control objectives of injecting energy into the system and stabilizing body-pitch can be effectively decoupled via proper tail design: a long, light tail. The overall control problem can thus be simplified by using the tail exclusively to stabilize body-pitch. This effectively relaxes the constraints on the leg actuators, allowing them to be recruited specifically for adding energy into the system. We show in simulation that models with long-light tails are better able to reject perturbations to body-pitch than short-heavy tails with the same moment of inertia. Further, we present the results of a one degree-of-freedom tail mounted on the open-loop controlled quadruped robot Cheetah-Cub. Our results show that an active tail can both greatly improve forward velocity and reduce body-pitch per stride, while adding minimal complexity. Further, the results validate the long-light tail design: shorter, heavier tails are much more sensitive to configuration and control parameter changes than longer and lighter tails with the same moment of inertia.

    Systematic review and meta-analysis of the diagnostic accuracy of ultrasonography for deep vein thrombosis

    Background: Ultrasound (US) has largely replaced contrast venography as the definitive diagnostic test for deep vein thrombosis (DVT). We aimed to derive a definitive estimate of the diagnostic accuracy of US for clinically suspected DVT and identify study-level factors that might predict accuracy. Methods: We undertook a systematic review, meta-analysis and meta-regression of diagnostic cohort studies that compared US to contrast venography in patients with suspected DVT. We searched Medline, EMBASE, CINAHL, Web of Science, Cochrane Database of Systematic Reviews, Cochrane Controlled Trials Register, Database of Reviews of Effectiveness, the ACP Journal Club, and citation lists (1966 to April 2004). Random effects meta-analysis was used to derive pooled estimates of sensitivity and specificity. Random effects meta-regression was used to identify study-level covariates that predicted diagnostic performance. Results: We identified 100 cohorts comparing US to venography in patients with suspected DVT. Overall sensitivity for proximal DVT (95% confidence interval) was 94.2% (93.2 to 95.0), for distal DVT was 63.5% (59.8 to 67.0), and specificity was 93.8% (93.1 to 94.4). Duplex US had pooled sensitivity of 96.5% (95.1 to 97.6) for proximal DVT, 71.2% (64.6 to 77.2) for distal DVT and specificity of 94.0% (92.8 to 95.1). Triplex US had pooled sensitivity of 96.4% (94.4 to 97.1) for proximal DVT, 75.2% (67.7 to 81.6) for distal DVT and specificity of 94.3% (92.5 to 95.8). Compression US alone had pooled sensitivity of 93.8% (92.0 to 95.3) for proximal DVT, 56.8% (49.0 to 66.4) for distal DVT and specificity of 97.8% (97.0 to 98.4). Sensitivity was higher in more recently published studies and in cohorts with higher prevalence of DVT and more proximal DVT, and was lower in cohorts that reported interpretation by a radiologist. Specificity was higher in cohorts that excluded patients with previous DVT. No studies were identified that compared repeat US to venography in all patients. Repeat US appears to have a positive yield of 1.3%, with 89% of these being confirmed by venography. Conclusion: Combined colour-doppler US techniques have optimal sensitivity, while compression US has optimal specificity for DVT. However, all estimates are subject to substantial unexplained heterogeneity. The role of repeat scanning is very uncertain and based upon limited data.
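For readers unfamiliar with the quantities reported above, the following sketch shows how sensitivity and specificity are obtained from a single cohort's 2x2 table, with a Wilson score interval for each. The counts are hypothetical, the function names are made up for illustration, and the sketch does not reproduce the random effects pooling across cohorts that the review used.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a proportion (z = 1.96 gives roughly 95%)."""
    p = successes / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half, center + half


def diagnostic_accuracy(tp, fn, tn, fp):
    """Sensitivity and specificity of an index test against a reference standard."""
    sensitivity = tp / (tp + fn)  # diseased patients correctly detected
    specificity = tn / (tn + fp)  # disease-free patients correctly cleared
    return {
        "sensitivity": sensitivity,
        "sensitivity_ci": wilson_interval(tp, tp + fn),
        "specificity": specificity,
        "specificity_ci": wilson_interval(tn, tn + fp),
    }


# Hypothetical single-cohort counts, with venography as the reference standard.
result = diagnostic_accuracy(tp=90, fn=10, tn=180, fp=20)
print(result)
```

A pooled estimate would combine such per-cohort results while modelling between-study variation, which is where the heterogeneity noted in the conclusion enters.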

    The state of the Martian climate

    The average annual surface air temperature (SAT) anomaly for 2016 for land stations north of 60°N was +2.0°C, relative to the 1981–2010 average value (Fig. 5.1). This marks a new high for the record starting in 1900, and is a significant increase over the previous highest value of +1.2°C, which was observed in 2007, 2011, and 2015. Average global annual temperatures also showed record values in 2015 and 2016. Currently, the Arctic is warming at more than twice the rate of lower latitudes.
